Mutations of cryptic splice sites in cre and cre fused proteins for improvement of expression and inducibility

ABSTRACT

The invention relates to a DNA sequence, which codes for a mutant of the bacteriophage P1 recombinase, in which a particular cryptic splicing site is disabled by means of a base mutation which alters the Cre protein sequence. The invention further relates to a DNA sequence which codes for a fusion protein from the named Cre mutants and a ligand-binding domain of a receptor protein. The invention furthermore relates to vectors, micro-organisms and transgenic organisms, containing such a DNA sequence and Cre mutants or Cre fusion proteins which are coded by said DNA sequence.

[0001] The present invention relates to a DNA sequence coding for a mutant of bacteriophage P1 recombinase Cre in which a special cryptic splicing site has been eliminated by a base mutation changing the Cre protein sequence, and to a DNA sequence coding for a fusion protein derived from said Cre mutant and a ligand-binding domain of a receptor protein. The invention further relates to vectors, microorganisms and transgenic organisms which contain such a DNA sequence, and to Cre mutants or Cre fusion proteins encoded by said DNA sequences.

BACKGROUND OF THE INVENTION

[0002] Well-aimed alterations in the genome of mammals can be effected by means of site-specific recombinases. The bacteriophage P1 recombinase Cre recognizes specific base sequences (loxP sites) and is capable, depending on the position and combination of these loxP sites in the genome, of deleting, inverting, translocating or otherwise desirably mutating genes (for a survey, see Rajewsky et al., J. Clin. Invest. 98, 51-53 (1996)). For inducing the mutation event at any time desired in a selected tissue or cell type of mice, posttranslational switches for Cre activity have been developed, inter alia (Schwenk et al., Nucleic Acids Research 26, 1427-1432 (1998)). These are fusion proteins derived from Cre and the ligand-binding domain of steroid receptors (hereinafter referred to as “LBD”, for example, progesterone receptor, estrogen receptor or glucocorticoid receptor). Such Cre-LBD fusion proteins are inactive with respect to their Cre recombinase functionality, but can be activated by the addition of the corresponding ligands (posttranslationally). In cell cultures and transgenic animals, mutants of LBD were tested which prefer synthetic ligands over natural ones in order that Cre activation may not be triggered by endogenous hormones.

[0003] The application of this technology in transgenic mice using both ubiquitously active and cell-type specific promoters has been published (e.g., Schwenk et al. (1998); Kellendonk et al., J. Mol. Biol. 285, 175-182 (1999)). Although it is shown in these works that posttranslational activation of Cre through LBD functions in principle, the system has two critical disadvantages:

[0004] 1) a reduced activity of Cre-LBD as compared to non-fused Cre;

[0005] 2) Background activity of Cre-LBD, i.e., Cre recombinase activity even in the absence of the ligand.

[0006] For applications in vivo, the background activity has proven problematic, especially when this event occurs rather early in the course of development. This produces mosaic animals in which the recombination event has already occurred in part of the cells while posttranslational control is still possible in the remaining cells. It would be desirable to have a better controlled Cre regulation system which has no (measurable) background activity. This is where the present invention comes into play.

[0007] Possible causes of the background activity include:

[0008] 1) cryptic splicing of the Cre-LBD-encoding mRNA in such a way that a truncated residual protein having non-regulated Cre activity can be translated; and

[0009] 2) posttranslational degradation, e.g., by endogenous proteases which process an unregulated Cre.

SUMMARY OF THE INVENTION

[0010] Surprisingly, it has been found that these undesirable processes are eliminated when specific modifications are introduced into Cre-coding sequences, such as Cre-LBD-coding sequences:

[0011] 1) Optimization of the Cre-coding sequence with respect to the exclusion of critical splicing events;

[0012] 2) mutation or shortening of the linker region between Cre and LBD to render the accession of proteases to this region more difficult.

[0013] Thus, the present invention relates to:

[0014] (1) a DNA sequence coding for a mutant of bacteriophage P1 recombinase Cre in which the cryptic splicing site in the sequence ATG GTG CGC, which corresponds to positions 1003-1011 in the wild-type sequence shown in SEQ ID NO: 1, has been eliminated by a base mutation changing the Cre protein sequence;

[0015] (2) a DNA sequence coding for a fusion protein derived from bacteriophage P1 recombinase Cre and another functional protein, wherein the sequence encoding bacteriophage P1 recombinase Cre is defined as under (1);

[0016] (3) a preferred embodiment of the DNA sequence as defined under (2) wherein said further functional protein is a ligand-binding domain of a receptor protein;

[0017] (4) a vector containing a DNA sequence according to (1), (2) and/or (3);

[0018] (5) a microorganism or a transgenic organism containing a vector as defined in (4) and/or a DNA sequence as defined in (1), (2) and/or (3);

[0019] (6) a Cre mutant or Cre fusion protein encoded by the DNA sequence according to (1), (2) and/or (3);

[0020] (7) a process for preparing a Cre mutant or Cre fusion protein as defined under (6), comprising the culturing of a microorganism or a cell structure which has been transformed or transfected with a vector according to (4); and

[0021] (8) the use of a DNA sequence according to (2) or (3) for the mutagenesis and/or recombination of target sequences containing loxP sites in organisms.

[0022] The present invention will now be described in more detail by the attached Figures and Examples.

DESCRIPTION OF THE FIGURES

[0023] FIG. 1: Splicing pattern of selected Cre-LBD constructs

[0024] A to D) Represented are the pre-mRNAs of selected Cre-LBD constructs with cryptic splicing donors (SD) and acceptors (SA). The values stated in parentheses express relative probabilities of the utilization of these sites (“scores”). The score is a measure of how similar the splicing sites are to the consensus sequences (E). 100% correspondence with SA yields a score of 14.2, and the mean value for constitutive exons is 7.9. 100% correspondence with SD yields a score of 12.6, and the mean value for constitutive exons is 8.1. Only those SD and SA whose score is above 4 have been indicated. The scores are calculated by means of the computer program “splice site score calculation” (http://www2.1mcb.osakau.ac.jp/splice/score.html). Abbreviations used: SS, synthetic splicing substrate (Choi et al., Mol. Cell. Biol., 11, 3070-3074 (1991)); Cre19, Cre starting with amino acid 19; hCre2, humanized Cre starting with amino acid 2; PR and ER, LBD of the progesterone or estrogen receptor, respectively, the numbers denoting the amino acid position; ERT2, LBD of estrogen receptor with the amino acid substitutions G400V, M543A, L544A (Feil et al., Biochem. Biophys. Res. Commun. 237, 752-757 (1997)); bpA, polyadenylation signal of the bovine growth hormone; Y, either C or T.

[0025] FIG. 2: A cryptic splicing event of a P1CrePR650-914-encoding construct detected by RT-PCR.

[0026] Represented is the fusion region of Cre19PR650-914.

[0027] A) Sequence comparison of the cryptically spliced sequence (top) with the corresponding unspliced sequence (bottom).

[0028] B) Schematic representation of the cryptic splicing event. Important amino acids in this region are indicated. The arrows indicate the course of PCR: After the RT reaction, the sequence was amplified with the 5′-primer and the outer 3′-primer. After isolation from the agarose gel, a semi-nested PCR with the 5′-primer and the inner 3′-primer was effected.

[0029] FIG. 3: Sequence comparison of the 3′ ends of the various Cre-coding sequences with the consensus for splicing donors.

[0030] Section of the 3′ end of the Cre-coding sequence of

[0031] A) P1-Cre

[0032] B) hCre

[0033] C) P1-CreV336A; and

[0034] D) hCreV336A.

[0035] Shown is that region of the Cre sequence where the amino acid substitution has been introduced. The base sequence around V336 in P1-Cre shows a high homology with the represented consensus of a splicing donor site. The correspondence still increases in hCre, and thus there is a higher risk of cryptic splicing at this site. In the mutated sequences of P1-Cre and hCre, codon 336 (numbering scheme based on the wild-type sequence shown in SEQ ID NO: 1) was changed from GTG (Val) to GCG (Ala); due to this exchange, splicing is no longer possible since the sequence GTN can be considered essential. The base exchange from T to C is at position 1007, based on the base sequence of the wild type Cre (SEQ ID NO: 1, according to Sternberg et al., 1986).

[0036] FIG. 4: Activity test of Cre19PR676-914, hCre19PR676-914 and hCre19V336APR676-914 according to Example 6. The bars represent mean values from three different transfections each with standard deviations. White bars represent percent activities of non-mutated Cre19PR676-914 or hCre19PR676-914, and black bars represent mutated hCre19V336APR676-914. The numbers within the bars indicate the inducibility of the corresponding construct, which is calculated by dividing the percent activity in the induced state by the background. “+”: 100 nM RU486 in the medium; “−”: no inducer in the medium.

[0037] FIG. 5: Activity test of mutated P1-Cre as compared with mutated hCre in fusion with PR

[0038] The experiment was performed as described in Example 6. Represented are two independent transfection runs whose bars represent mean values from three different transfections each with standard deviations. White bars represent percent activities of the Cre19-LBD constructs relative to CMV-Cre, gray bars represent Cre19V336APR676-914, and black bars represent hCre19V336A-LBD constructs. “+”: 100 nM RU486 in the medium; “−”: no inducer in the medium.

[0039] FIG. 6: Schematic comparison of the cryptic splicing patterns of the Cre sequences employed and effects on Cre activity in the absence of the inducer.

[0040] The stated percentages refer to a comparison of the represented Cre constructs in fusion with PR676-914 relative to non-fused Cre (=100%).

[0041] FIG. 7: Comparative analysis of the inducible Cre activities of Cre*PR on which this invention is based (i.e., hCreV336A-PR650-914) and the previously published CrePR1 (Kellendonk et al., Nucl. Acid Res. 24, 1404-1411 (1996)). A) The experiment was performed as described in Example 6. Represented is the mean value of three different transfections with standard deviations. B) Dose-effect analysis: Reporter cells were transfected with the stated constructs and cultured in the presence of different concentrations of RU486.

DETAILED DESCRIPTION OF THE INVENTION

[0042] To be able to estimate the extent of cryptic splicing processes in Cre fusion proteins, patterns of potential splicing sites were established by computer analyses (FIG. 1). Within the Cre-coding sequence of bacteriophage P1 (briefly referred to as P1 hereinafter), there appears a relatively strong acceptor at the 5′ terminus as well as five donors of medium strength in the 3′ half (FIG. 1A). In the LBD-coding sequence of progesterone receptor (nucleotides 1948 to 2742 of SEQ ID NO: 3 which code for amino acids 650 to 914 of the wild type receptor, “PR” hereinafter), two acceptors of medium strength and one SA of extremely high strength appear (FIGS. 1A-B). In the corresponding estrogen-receptor sequence (“ER” hereinafter), the analysis yielded only two SAs of medium strength (FIG. 1D). One or more of the donor sequences positioned at the 3′ end of P1Cre could form an intron with one of the acceptors in PR or ER. By splicing this intron, a premature stop of translation would be generated due to a shift in the reading frame, resulting in a truncated and possibly constitutively active Cre protein without the regulatory portion of LBD. By RT-PCR, it could be proven that such cryptic splicing actually occurs in eukaryotic cells (see FIG. 2). The functionality of the SA (11.2) in the PR in a physiological context has been published by Balleine et al., J. Clin. End. Meta. 84, 1370-1377 (1999).

[0043] In this publication, an alternatively spliced transcript of human progesterone receptor is described which occurs accumulated in breast cancer tumors. Normally, the cryptic splicing of a very low proportion of transgenic transcripts is of less importance. However, in the case of Cre-LBD transcripts, even a minute amount of cryptic splicing products can account for the occurrence of the Cre background activity observed.

[0044] Although the hCre sequence (F. Stewart, unpublished results) is improved in this respect by an assimilation of the codon usage, since many cryptic splicing sites are eliminated by silent mutations (FIGS. 1A and B), a splicing donor site directly at the 3′ end is not eliminated; on the contrary, it is enhanced by the codon optimization (“SD (8,9)” in FIG. 1B). When the base sequence in this position is examined (FIG. 3), it becomes evident that elimination of this SD by a silent mutation is not possible: each codon coding for valine (GTN) at the amino acid position 336 of the wild type includes the two bases GT which are critical to the splicing event. A mutation has therefore been introduced which destroys the critical GT sequence (GTG to GCG, exchange at base position 1007, based on the coding base sequence of the wild type Cre sequence, cf. Sternberg et al., 1986), accepting a conservative amino acid exchange from valine to alanine. This mutation was introduced in both P1-Cre (FIG. 3C) and hCre (FIG. 3D); thus, in hCreV336A, all cryptic SD sites at the 3′ end have been eliminated (FIG. 3D), while in P1-CreV336A, one SD is still present (FIG. 3C).
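The argument in the preceding paragraph can be checked with a minimal sketch. It assumes only the nine-base window ATG GTG CGC (positions 1003-1011) stated above and the standard genetic code; the helper names `substitute` and `translate` are hypothetical and introduced for illustration only.

```python
# Standard genetic code in the usual compact T/C/A/G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Wild-type window around codon 336; position 1003 is the first base of codon 335.
WINDOW_START = 1003
WILD_TYPE_WINDOW = "ATGGTGCGC"


def substitute(window, position, base, start=WINDOW_START):
    """Replace the base at a 1-based sequence position inside the window."""
    index = position - start
    return window[:index] + base + window[index + 1:]


def translate(window):
    """Translate an in-frame window codon by codon (one-letter amino acid code)."""
    return "".join(CODON_TABLE[window[i:i + 3]] for i in range(0, len(window), 3))


# T -> C exchange at position 1007 (GTG -> GCG).
mutant_window = substitute(WILD_TYPE_WINDOW, 1007, "C")

# All valine codons start with GT, so no silent mutation can remove the dinucleotide.
valine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "V")
silent_fix_possible = any("GT" not in codon for codon in valine_codons)

print(translate(WILD_TYPE_WINDOW), "->", translate(mutant_window))
print("GT present in mutant window:", "GT" in mutant_window)
print("silent removal of GT possible:", silent_fix_possible)
```

The sketch only inspects the presence of the GT dinucleotide; it does not compute splice site scores, whose calculation is not specified in this text.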

[0045] Using a reporter cell line containing loxP-stop-loxP-lacZ (Kellendonk et al., 1999), the constructs were quantitatively examined for their inducibility and, in particular, for Cre activity in the absence of an inducer (FIG. 4). The hCre-PR-coding construct optimized by the V336A mutation shows a background which is almost no longer measurable (FIG. 4C) and is lower by about a factor of 10 as compared with the non-mutated construct (FIG. 4B). Accordingly, the inducibility is improved from about 10- to 15-fold to between 76- and 165-fold. The introduction of the mutation V336A into the P1-Cre sequence results in a background activity (FIG. 5B) which is between that of the mutated hCre construct (FIG. 5C) and that of non-mutated P1-Cre (FIG. 5A). This could be attributable to the further cryptic splicing sites contained in this construct based on P1-Cre (cf. FIG. 6C). The greatest reduction of the background is achieved with the construct having the fewest cryptic donor sites, namely hCreV336A-PR (cf. FIG. 6D).

[0046] Another improvement could be achieved with respect to maximum activity. The Cre-PR fusion proteins represented in FIG. 4 exhibit a maximum activity of only from 30 to 40% as compared with constitutively active non-fused Cre. From earlier studies in our laboratory, it was known that other variants of Cre-PR (e.g., Cre-PR650-914) exhibit a clearly higher activity of up to above 50% of that of Cre. However, these constructs always also have a very high background (see FIG. 5D), which is why they are hardly suitable for use in a transgenic animal. By introducing the above described mutation V336A into this CrePR variant, the undesirable background can be reduced to a level which is again almost not measurable while a high activity of above 40% remains (FIG. 5E).

[0047] The construct hCreV336A-PR650-914 on which this invention is based (referred to as Cre*PR hereinafter) was compared directly with a previously published and also optimized construct, CrePR1 (Kellendonk et al., Nucl. Acid Res. 24, 1404-1411 (1996)), under identical experimental conditions. In this comparison, Cre*PR exhibits a 50-fold reduced background activity as compared with CrePR1, combined with a very high increase in inducibility (FIG. 7A). In addition, dose-effect analyses have shown that Cre*PR responds to significantly lower concentrations of RU486 than the previously employed CrePR1 (FIG. 7B).

[0048] In the DNA sequence according to the invention (as defined above under (1)), the base mutation changing the Cre protein sequence and eliminating the cryptic splicing site in the Cre DNA sequence is preferably a base substitution in which the codon GTG in the cryptic splicing site sequence defined above under (1) has been replaced by a codon XYZ in which X, Y and Z are independently the nucleotides A, T, C or G, provided that if X=G, then Y≠T (i.e., if Y=T, then X≠G). This means that any sequence which in this position does not contain the sequence GT, which is critical to the splicing process, is suitable. In particular, those base sequences are preferred which result in a conservative amino acid exchange in the resulting protein sequence. Since the sequence GTG codes for Val, this means that sequences coding for aliphatic amino acids, such as alanine (XYZ is GCN), leucine (XYZ is CTN, TTA, TTG) and isoleucine (XYZ is ATA, ATC or ATT), are particularly preferred. Preferred DNA sequences comprise the partial sequences shown in SEQ ID NOS: 13 and 14.
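The selection rule for the replacement codon XYZ can be illustrated by a short enumeration. This is a sketch under the assumption that the standard genetic code applies; the function names are hypothetical. It lists every codon that does not begin with GT and then groups those coding for the aliphatic amino acids named above.

```python
# Standard genetic code in the usual compact T/C/A/G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def allowed_replacements():
    """All codons XYZ except those with X=G and Y=T."""
    return sorted(codon for codon in CODON_TABLE if not codon.startswith("GT"))


def conservative_replacements():
    """Allowed codons grouped by the aliphatic amino acid they encode."""
    aliphatic = {"A": "Ala", "L": "Leu", "I": "Ile"}
    grouped = {}
    for codon in allowed_replacements():
        amino_acid = CODON_TABLE[codon]
        if amino_acid in aliphatic:
            grouped.setdefault(aliphatic[amino_acid], []).append(codon)
    return grouped


print(len(allowed_replacements()), "codons satisfy the XYZ rule")
for name, codon_list in conservative_replacements().items():
    print(name, codon_list)
```

Note that the rule as literally stated also admits the three stop codons; the sketch does not filter them out, so a practical selection would restrict itself to the amino acid-coding subset.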

[0049] In addition, in the DNA sequence according to the invention, other cryptic splicing sites present in the wild type sequence may also be eliminated by silent mutations and/or by base mutations changing the Cre protein sequence. Suitable positions for such mutations can be seen from FIGS. 1 and 6.

[0050] In a preferred embodiment of the present invention, the DNA sequence coding for the Cre mutant is also truncated at the 5′ end with respect to the Cre wild type sequence. Any truncation is suitable which does not affect the functionality of the resulting Cre proteins. However, it is preferred that the nucleotides corresponding to positions 1 to 54, preferably 1 to 15, of the wild type are truncated.

[0051] In another preferred embodiment of the present invention, the DNA sequence coding for the Cre mutant is also modified at the 5′ end. Within the meaning of the present invention, “modified” means that additional nucleobases may be added both to the 5′ end of the untruncated sequence and to the 5′ end of the Cre DNA sequence truncated as defined above. These additional nucleobases are codons coding for amino acids which are supposed to have a stabilizing effect on the total protein according to the so-called N-end rule (Levy, F. et al., Proc. Natl. Acad. Sci. USA 93, 4907-4912 (1996)). Such stabilizing amino acid sequences essentially consist of neutral amino acids, preferably Met, Gly, Val and Ala. Both sequences which contain combinations of these amino acids as well as homogeneous sequences of one of the amino acids mentioned are possible. The length of the stabilizing amino acid sequences is preferably from 1 to 10 amino acids (corresponding to from 3 to 30 nucleobases), especially from 1 to 3 amino acids. In a particularly preferred embodiment, the DNA sequence contains the sequence ATG GGC GCC (coding for Met-Gly-Ala) starting immediately at the translation origin.

[0052] In a preferred embodiment, the DNA sequence according to the invention comprises the nucleobases 1 to 984 of SEQ ID NOS: 5, 7 or 9.

[0053] In another preferred embodiment, the DNA sequence is truncated at the 3′ terminus with respect to the wild type sequence. According to the present invention, from 1 to 24 nucleotides, based on the wild type sequence shown in SEQ ID NO: 1, may be truncated (i.e., truncation behind the mutation changing the Cre protein sequence which is present in position 1006-1008). Deletions of single codons downstream from the codon 1006-1008 are also possible according to the present invention.

[0054] In the DNA sequence coding for a fusion protein, the functional protein is selected from the group of proteins consisting of enzymes, peptide hormones, pharmacologically active peptides and structural proteins. The DNA sequences which code for the functional proteins may be positioned both downstream or upstream from the DNA sequence coding for the Cre mutant. However, the stabilizing effect of the Cre mutant sequence according to the invention is manifested especially if the DNA sequence coding for the functional protein is downstream from the Cre DNA sequence, i.e., is linked to the 3′ end of the Cre DNA sequence.

[0055] Preferably, the functional protein is the ligand-binding domain of a receptor protein, especially a ligand-binding domain of a steroid receptor, such as progesterone, estrogen or glucocorticoid receptor. These DNA sequences can be derived from any mammals, preferably from humans, mice or rats. More preferred is a ligand-binding domain of a steroid receptor, especially one comprising the nucleotides 1948 to 2742 of the human sequence shown in SEQ ID NO: 3.

[0056] The above defined preferred DNA sequences coding for fusion proteins may contain further fusion partners (DNA sequences coding for functional proteins), e.g., those which enable easier purification (such as glutathione-S-transferase, GST, or maltose-binding protein, MBP), detection (e.g., green fluorescent protein, GFP) or transduction (e.g., transduction domains of HIV-TAT or VP22). In addition, the present invention also relates to DNA sequences coding for fusion proteins which comprise the Cre sequence according to the invention and the further fusion partners mentioned.

[0057] Further, it is preferred that the linker sequence in the fusion protein between the Cre domain and the DNA sequence coding for the functional protein, especially for the ligand-binding domain, be as short as possible and/or mutated to impede the access for proteases. For the DNA sequences coding for these fusion proteins, this means that if the DNA coding for the functional protein (or the ligand-binding domain) is downstream from the Cre DNA sequence, then the region behind the mutated cryptic splicing site (at positions 1006-1008, based on the wild type sequence) to the end of the Cre sequence or to the functional start of the ligand-binding domain as defined above may be wholly or partially deleted and/or mutated. The term “mutated” also includes the case that the original DNA sequence is replaced by a short DNA sequence which codes for an amino acid sequence which essentially consists of neutral amino acids, such as Gly, Ala, Val, Pro or Ile, especially Gly and Pro. In particular, the length of this amino acid sequence is within a range of from 1 to 10, preferably from 1 to 5, amino acids.

[0058] Preferably, the splicing sites occurring within the sequence coding for the ligand-binding domain have also been eliminated by silent mutation or by a base mutation changing the protein sequence of the ligand-binding domain. More preferred sequences are the DNA sequences shown in SEQ ID NOS: 5, 7 and 9.

[0059] The vector according to the invention as defined above under (4) may contain additional functional sequences, such as the sequences defined above which enable purification, detection or transduction, in addition to the DNA sequence (1), (2) and/or (3). Further, it may also contain promoter sequences which enable expression, e.g., in bacteria or yeasts or lepidoptera cells, or expression in transgenic organisms as defined above. In addition, the vector may include replication origins, selectable markers and multiple cloning sites and the like.

[0060] The microorganisms or transgenic organisms according to the invention may both be transformed with the vector according to the invention as defined above and have the DNA sequence according to the invention integrated in their chromosomes. “Microorganisms” according to the present invention includes both prokaryotes and eukaryotes. “Organisms” according to the present invention includes vertebrates, especially mammals (including human and non-human mammals, more especially rodents such as mice or rats) or fishes, or invertebrates, such as worms and flies.

[0061] In addition to the above defined sequences, the Cre mutant according to the invention or the Cre fusion protein according to the invention may also contain other proteinogenic structures or be linked to non-proteinogenic structures.

[0062] The process according to the invention for preparing the Cre mutant or the Cre fusion protein may be effected, as defined above, by culturing a microorganism transformed with a suitable vector. The target proteins may then be isolated and purified from the culture by methods known to the skilled person.

[0063] The invention is of great importance to experiments (both in vivo and in vitro) which relate to the conditional expression of genes. The sequence according to the invention is useful, in particular, if no background activity of the induction is to be tolerated. For example, upon application of a former Cre-LBD system, it was reported that the inducible expression of a reporter gene was limited by spontaneous expression. This background activation is explained by a ligand-independent recombinase activity of Cre-LBD (Fuhrmann-Benzakein et al., Nucl. Acids Res. 28, e99 (2000)). The application of a construct according to the invention can overcome such limitation. The generation and analysis of Cre*PR-transgenic mice is currently under research.

[0064] The invention is further of particular importance to the further development of inducible conditional gene targeting. “Conditional gene targeting” means the selective mutagenesis of a gene in a freely selectable cell type or tissue; also, the mutagenesis may optionally be effected at any time desired using an inducible system. These modifications of classical gene targeting are important especially when the mutation already occurring in germ cells exhibits undesirable side-effects. Using the invention, transgenic animals, preferably mice, can be prepared which express a posttranslationally inducible Cre fusion protein. Upon crossing with a murine line which carries a loxP-containing target sequence, selected changes in the genome, such as deletions, inversions or translocations, can be effected at any time desired by adding an inducer. One particular advantage of the invention resides in the low Cre recombinase activity described which prevails prior to the addition of the inducer. Thus, an undesirable (i.e., occurring prior to the addition of the inducer) modification of the genotype is not to be expected in the transgenic animal. To date, such Cre background activity has rendered it difficult to unambiguously establish the genotype before or after induction. Depending on the specificity of the promoter used in the transgene, the expression of the inducible Cre can be effected either in a cell-type or tissue-specific way or ubiquitously. In the latter case, a transgenic animal is obtained which allows the temporal control over mutagenesis in any type of tissue or cells. A control with respect to time and space is enabled by using cell-type specific promoters. The transgenic constructs can be either used by microinjection for generating transgenic lines, or integrated into selected regions of the genome by homologous recombination.

[0065] The present invention will now be explained in more detail by the following Examples.

EXAMPLES

[0066] In the Examples described below, the following primers and templates were used:

Primers:

Name | Sequence
3Bsicre343 | AAATTCGTACGCATCGCCATCTTCCAGCAGG
3Narcre343 | AAATTGGCGCCATCGCCATCTTCCAGCAGG
3Kascre343 | AATTTGGCGCCGTCCCCATCCTCGAGCAG
3Sfomutcre343 | AAATTGGCGCCATCGCCATCTTCCAGCAGGCGCGCCATTGCCCC
3Sfomutsshcre343 | AAATTGGCGCCGTCCCCATCCTCGAGCAGCCTCGCCATGGCCCC
5Bamhcre2 | GGGGGATCCACCATGGGTGCCTCCAACCTGCTGACTGTG
5Bamhcre19 | TTTAAGGATCCACCATGGGTGCCACGAGTGATGAGGTTCGCA
5Bsicre19 | TTTAACGTACGGCACGAGTGATGAGGTTCGCA
5Narcre19 | TTTAAGGCGCCACGAGTGATGAGGTTCGCA

Templates:

Name | Source
pGKCrebpA | Institut für Genetik, Universität zu Köln
pBluehCre | Seeburg, MPI für med. Forschung, Heidelberg, and Stewart, EMBL, Heidelberg
pNNCre19PR676-914 | Institut für Genetik, Universität zu Köln
pNNhCre19PR676-914 | Institut für Genetik, Universität zu Köln
pNN265E-bpA | Artemis, Cologne

Example 1 Cloning Step I: LBD in pNN

[0067] In the first cloning step, the various lengths of the LBD of PR and of ER were cloned into pNN265E-bpA (obtainable from the Institut für Genetik, Univ. zu Köln, Germany). The cloning was effected through the restriction enzymes BamHI and EcoRV which cut only once in the vector and which were also introduced into the amplificates by PCR.

[0068] Various lengths of the LBD of PR were generated by 5′- and 3′-terminal truncations. The 5′ truncations began at amino acids 650, 676 and 678 of PR. By beginning the PR-LBD at Leu650, the D domain of the LBD is shortened by 10 amino acids. This N terminus was selected since such constructs have high activities in the context with Cre, as known from former experience (Kellendonk et al., Nucl. Acid Res. 24, 1404-1411 (1996)). A second N-terminal truncation was effected up to Ser676 of PR. This was based on the result found by Kellendonk et al. (1996) and the following consideration derived from it: the fewer amino acids of the D domain of PR there are as linkers between Cre and LBD, the lower the background activity of the fusion proteins. Inter alia, the amino acids 641, 672 and 687 were tested as starting points for LBD (Kellendonk et al., 1996). PR641 showed a high activity in the context with Cre, but also a high background. In contrast, PR672 showed a little less activity and also less background as compared to PR641. PR687 showed hardly any activity at all. Therefore, the starting point of the Ser676 construct was chosen between amino acids 672 and 687, with the expectation that a high activity of Cre is retained and the background also remains on a tolerably low level. In addition, in Ser676 constructs, in contrast to Leu650, two strong splicing acceptors (SA) for possible cryptic splicing have been modified on the sequence level. One SA is eliminated by deleting the sequence. The second SA is changed in its consensus sequence in such a way that the highly C/T-rich sequence segment is lacking in this construct. In addition, care was taken in the linker region to delete amino acids which may be recognized by proteases, such as Arg and Lys. In the third 5′ truncation variant, Gly678, the remaining sequence of the second SA, which is still present in Ser676, was deleted. In order that the Cre activity in such constructs is not limited any further by shortening the linker, two glycines, which are also intended to enhance the flexibility of the linker, were substituted for Ser676 and Pro677.

Example 2 Cloning of Cre19 to the N or C Terminus of the LBD

[0069] The non-directed cloning of Cre19 to the N terminus of the LBD was effected through the KasI restriction site. The insert amplified on the template pGKCrebpA (Table 1) was digested with KasI and ligated into KasI-restricted and then dephosphorylated pNN vectors with LBD. In addition, Cre19 was cloned in a non-directed way to the C-terminal end of the LBD through the BsiWI restriction site of the pNN vectors with the various LBDs of PR and ER. For this purpose, the Cre19 sequence was amplified by PCR on the template pGKCrebpA with the primers represented in Table 1, restricted with BsiWI and ligated into BsiWI-restricted, dephosphorylated pNN vectors.

TABLE 1 Primer sequences employed

Construct | 5′-Primer | 3′-Primer | Template
N-terminal Cre19 | 5Narcre19 | 3Narcre343 | pGKCrebpA
C-terminal Cre19 | 5Bsicre19 | 3Bsicre343 | pGKCrebpA

Example 3 Cloning of hCre to the N Terminus of PR676-914

[0070] Since Cre is a phage protein and its sequence is therefore adapted to the prokaryotic translation machinery, it was attempted to achieve an optimization for the eukaryotic expression system. R. Sprengel optimized the sequence of Cre accordingly in cooperation with F. Stewart to thus obtain an improved translation efficiency. He called the thus altered Cre sequence “humanized Cre” (hCre) (F. Stewart, unpublished results). After it was found that the construct pNNCre19PR676-914 exhibited the best Cre activities to date in a cell culture, an hCre was inserted in this construct in place of the normal Cre19.

[0071] The sequences of hCre2 and hCre19 were each cloned to the N terminus of PR676-914. PCR of hCre19 and hCre2 was performed on the template pBluehCre with the primers listed in Table 2; the amplificates were restricted with BamHI/KasI and ligated into a BamHI/KasI-restricted pNNCre19PR676-914.

TABLE 2
Primer sequences employed

                      5′-Primer     3′-Primer     Template
N-terminal hCre2      5Bamhcre2     3Kascre343    pBluehCre
N-terminal hCre19     5Bamhcre19    3Kascre343    pBluehCre

Example 4 Cloning of Cre19V336A

[0072] The mutation V336A was introduced into the phage Cre19 sequence. The Cre sequence comprising this mutation, referred to as Cre19V336A in the following, was then cloned to the PR676-914 sequence.

[0073] The mutation V336A was introduced into the Cre19 sequence through the 3′ primer listed in Table 3. The amplificate was generated on the template pNNCre19PR676-914 and cloned in a non-directed way through its SfoI restriction sites into an SfoI-restricted and subsequently dephosphorylated pNNCre19PR676-914. The correct orientation of Cre19V336A within the vector was shown by digestion with BamHI. The mutation was confirmed by sequencing.

TABLE 3
Primer sequences employed

                      5′-Primer     3′-Primer        Template
Cre19V336A            5Narcre19     3Sfomutcre343    pNNCre19PR676-914
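At the nucleotide level, the V336A exchange corresponds to a single base change in codon 336, as can be seen by comparing SEQ ID NO: 1 (gtg, Val) with the corresponding codon of SEQ ID NO: 5 (gcg, Ala). The following sketch reproduces this change on codons 334 to 338 of SEQ ID NO: 1, assuming the standard genetic code; `substitute_base` and the other names are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an upper-case DNA string whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitute_base(dna: str, index: int, new_base: str) -> str:
    """Return `dna` with the base at 0-based `index` replaced by `new_base`."""
    return dna[:index] + new_base + dna[index + 1:]


# Codons 334-338 of Cre (Ala Met Val Arg Leu), transcribed from SEQ ID NO: 1.
FIRST_CODON = 334
WILD_TYPE = "gca atg gtg cgc ctg".replace(" ", "").upper()

# Middle base of codon 336: T -> C turns GTG (Val) into GCG (Ala).
index = (336 - FIRST_CODON) * 3 + 1
mutant = substitute_base(WILD_TYPE, index, "C")

print(translate(WILD_TYPE))
print(translate(mutant))
```

The mutated fragment matches the corresponding stretch of SEQ ID NO: 5 (gca atg gcg cgc ctg); a comparison of this kind is what sequencing of the construct would be expected to confirm.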

Example 5 Cloning of hCre19V336A

[0074] The mutation V336A was introduced into the sequence of hCre19 through the 3′ primer listed in Table 4. Through the BamHI and SfoI cloning sites introduced by PCR on the template pNNhCre19PR676-914, hCre19V336A was cloned into a BamHI/SfoI-restricted pNNCre19PR676-914. The mutation was confirmed by sequencing.

TABLE 4

                      5′-Primer      3′-Primer           Template
hCre19V336A           5Bamhhcre19    3Sfomutsshcre343    pNNhCre19PR676-914

Example 6 Activity test on Cre19-PR676-914, hCre19-PR676-914 and hCre19V336A-PR676-914

[0075] CV1-5B cells, seeded on the previous day into the wells of 6-well plates (8×10⁴ cells per well and transfection), were transfected in triplicate with 1 μg of the respective fusion protein plasmid and 1 μg of pHD2-AP. Twelve hours after transfection, the cells were trypsinized and divided in equal portions between 2 wells of a 12-well plate. To one well, 100 nM RU486 (obtainable from Sigma Chemie GmbH, Deisenhofen, Germany) in medium was added, and medium without RU486 was added to the other. Three days after transfection, the cells were fixed and stained for β-galactosidase and alkaline phosphatase (the latter for normalization with respect to transfection efficiency). Activities were expressed as a percentage of pNN-CMV-Cre transfected in parallel (Cre without a fusion partner in the same vector): the mean value of CMV-Cre was set to 100%, and the mean values for the fusion proteins were compared with it. The result is summarized in FIG. 4.
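The evaluation described above can be sketched as follows. All cell counts are hypothetical placeholders and not measured data, and normalizing each replicate as the ratio of β-galactosidase-positive to alkaline-phosphatase-positive cells is an assumption about how the normalization was carried out. Only the final step, setting the CMV-Cre mean to 100%, is taken directly from the text.

```python
from statistics import mean

# Hypothetical triplicates: (beta-gal-positive cells, AP-positive cells).
# These numbers are placeholders for illustration, not results from FIG. 4.
CMV_CRE = [(100, 200), (120, 200), (80, 200)]
FUSION_WITH_RU486 = [(60, 200), (66, 200), (54, 200)]
FUSION_WITHOUT_RU486 = [(4, 200), (2, 200), (6, 200)]


def normalized_mean(replicates: list[tuple[int, int]]) -> float:
    """Mean beta-gal signal per replicate, normalized to the AP signal (assumed)."""
    return mean(lacz / ap for lacz, ap in replicates)


def percent_of_reference(sample, reference) -> float:
    """Express a sample mean as a percentage of the reference mean (= 100%)."""
    return 100.0 * normalized_mean(sample) / normalized_mean(reference)


induced = percent_of_reference(FUSION_WITH_RU486, CMV_CRE)
background = percent_of_reference(FUSION_WITHOUT_RU486, CMV_CRE)

print(f"activity with RU486:    {induced:.1f}% of CMV-Cre")
print(f"activity without RU486: {background:.1f}% of CMV-Cre")
print(f"induction factor:       {induced / background:.1f}")
```

Reporting the induced activity and the background separately, as here, keeps the two properties discussed in Example 1 (high activity, low background) visible for each construct.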

1 24 1 1032 DNA Bacteriophage P1 CDS (1)..(1029) 1 atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48 Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 1 5 10 15 gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96 Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 20 25 30 gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144 Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 35 40 45 tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192 Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 50 55 60 ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240 Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 65 70 75 80 cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288 Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 85 90 95 atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336 Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 100 105 110 gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384 Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 115 120 125 gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432 Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 130 135 140 gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480 Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 145 150 155 160 ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528 Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 165 170 175 att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576 Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 180 185 190 atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624 Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 195 200 205 gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672 Val Glu Lys Ala Leu 
Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 210 215 220 att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720 Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 225 230 235 240 cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768 Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 245 250 255 tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816 Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 260 265 270 tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864 Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 275 280 285 cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912 His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 290 295 300 tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960 Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 305 310 315 320 gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008 Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 325 330 335 cgc ctg ctg gaa gat ggc gat tag 1032 Arg Leu Leu Glu Asp Gly Asp 340 2 343 PRT Bacteriophage P1 2 Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 1 5 10 15 Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 20 25 30 Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 35 40 45 Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 50 55 60 Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 65 70 75 80 Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 85 90 95 Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 100 105 110 Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 115 120 125 Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 130 135 140 Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 145 150 155 160 Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 165 170 175 
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 180 185 190 Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 195 200 205 Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 210 215 220 Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 225 230 235 240 Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 245 250 255 Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 260 265 270 Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 275 280 285 His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 290 295 300 Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 305 310 315 320 Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 325 330 335 Arg Leu Leu Glu Asp Gly Asp 340 3 2802 DNA Homo sapiens CDS (1)..(2799) 3 atg act gag ctg aag gca aag ggt ccc cgg gct ccc cac gtg gcg ggc 48 Met Thr Glu Leu Lys Ala Lys Gly Pro Arg Ala Pro His Val Ala Gly 1 5 10 15 ggc ccg ccc tcc ccc gag gtc gga tcc cca ctg ctg tgt cgc cca gcc 96 Gly Pro Pro Ser Pro Glu Val Gly Ser Pro Leu Leu Cys Arg Pro Ala 20 25 30 gca ggt ccg ttc ccg ggg agc cag acc tcg gac acc ttg cct gaa gtt 144 Ala Gly Pro Phe Pro Gly Ser Gln Thr Ser Asp Thr Leu Pro Glu Val 35 40 45 tcg gcc ata cct atc tcc ctg gac ggg cta ctc ttc cct cgg ccc tgc 192 Ser Ala Ile Pro Ile Ser Leu Asp Gly Leu Leu Phe Pro Arg Pro Cys 50 55 60 cag gga cag gac ccc tcc gac gaa aag acg cag gac cag cag tcg ctg 240 Gln Gly Gln Asp Pro Ser Asp Glu Lys Thr Gln Asp Gln Gln Ser Leu 65 70 75 80 tcg gac gtg gag ggc gca tat tcc aga gct gaa gct aca agg ggt gct 288 Ser Asp Val Glu Gly Ala Tyr Ser Arg Ala Glu Ala Thr Arg Gly Ala 85 90 95 gga ggc agc agt tct agt ccc cca gaa aag gac agc gga ctg ctg gac 336 Gly Gly Ser Ser Ser Ser Pro Pro Glu Lys Asp Ser Gly Leu Leu Asp 100 105 110 agt gtc ttg gac act ctg ttg gcg ccc tca ggt ccc ggg cag agc caa 384 Ser Val Leu Asp Thr Leu Leu Ala Pro Ser Gly Pro Gly Gln Ser Gln 115 120 125 ccc agc cct ccc gcc 
tgc gag gtc acc agc tct tgg tgc ctg ttt ggc 432 Pro Ser Pro Pro Ala Cys Glu Val Thr Ser Ser Trp Cys Leu Phe Gly 130 135 140 ccc gaa ctt ccc gaa gat cca ccg gct gcc ccc gcc acc cag cgg gtg 480 Pro Glu Leu Pro Glu Asp Pro Pro Ala Ala Pro Ala Thr Gln Arg Val 145 150 155 160 ttg tcc ccg ctc atg agc cgg tcc ggg tgc aag gtt gga gac agc tcc 528 Leu Ser Pro Leu Met Ser Arg Ser Gly Cys Lys Val Gly Asp Ser Ser 165 170 175 ggg acg gca gct gcc cat aaa gtg ctg ccc cgg ggc ctg tca cca gcc 576 Gly Thr Ala Ala Ala His Lys Val Leu Pro Arg Gly Leu Ser Pro Ala 180 185 190 cgg cag ctg ctg ctc ccg gcc tct gag agc cct cac tgg tcc ggg gcc 624 Arg Gln Leu Leu Leu Pro Ala Ser Glu Ser Pro His Trp Ser Gly Ala 195 200 205 cca gtg aag ccg tct ccg cag gcc gct gcg gtg gag gtt gag gag gag 672 Pro Val Lys Pro Ser Pro Gln Ala Ala Ala Val Glu Val Glu Glu Glu 210 215 220 gat ggc tct gag tcc gag gag tct gcg ggt ccg ctt ctg aag ggc aaa 720 Asp Gly Ser Glu Ser Glu Glu Ser Ala Gly Pro Leu Leu Lys Gly Lys 225 230 235 240 cct cgg gct ctg ggt ggc gcg gcg gct gga gga gga gcc gcg gct gtc 768 Pro Arg Ala Leu Gly Gly Ala Ala Ala Gly Gly Gly Ala Ala Ala Val 245 250 255 ccg ccg ggg gcg gca gca gga ggc gtc gcc ctg gtc ccc aag gaa gat 816 Pro Pro Gly Ala Ala Ala Gly Gly Val Ala Leu Val Pro Lys Glu Asp 260 265 270 tcc cgc ttc tca gcg ccc agg gtc gcc ctg gtg gag cag gac gcg ccg 864 Ser Arg Phe Ser Ala Pro Arg Val Ala Leu Val Glu Gln Asp Ala Pro 275 280 285 atg gcg ccc ggg cgc tcc ccg ctg gcc acc acg gtg atg gat ttc atc 912 Met Ala Pro Gly Arg Ser Pro Leu Ala Thr Thr Val Met Asp Phe Ile 290 295 300 cac gtg cct atc ctg cct ctc aat cac gcc tta ttg gca gcc cgc act 960 His Val Pro Ile Leu Pro Leu Asn His Ala Leu Leu Ala Ala Arg Thr 305 310 315 320 cgg cag ctg ctg gaa gac gaa agt tac gac ggc ggg gcc ggg gct gcc 1008 Arg Gln Leu Leu Glu Asp Glu Ser Tyr Asp Gly Gly Ala Gly Ala Ala 325 330 335 agc gcc ttt gcc ccg ccg cgg agt tca ccc tgt gcc tcg tcc acc ccg 1056 Ser Ala Phe Ala Pro Pro Arg Ser Ser Pro Cys Ala Ser Ser Thr Pro 
340 345 350 gtc gct gta ggc gac ttc ccc gac tgc gcg tac ccg ccc gac gcc gag 1104 Val Ala Val Gly Asp Phe Pro Asp Cys Ala Tyr Pro Pro Asp Ala Glu 355 360 365 ccc aag gac gac gcg tac cct ctc tat agc gac ttc cag ccg ccc gct 1152 Pro Lys Asp Asp Ala Tyr Pro Leu Tyr Ser Asp Phe Gln Pro Pro Ala 370 375 380 cta aag ata aag gag gag gag gaa ggc gcg gag gcc tcc gcg cgc tcc 1200 Leu Lys Ile Lys Glu Glu Glu Glu Gly Ala Glu Ala Ser Ala Arg Ser 385 390 395 400 ccg cgt tcc tac ctt gtg gcc ggt gcc aac ccc gca gcc ttc ccg gat 1248 Pro Arg Ser Tyr Leu Val Ala Gly Ala Asn Pro Ala Ala Phe Pro Asp 405 410 415 ttc ccg ttg ggg cca ccg ccc ccg ctg ccg ccg cga gcg acc cca tcc 1296 Phe Pro Leu Gly Pro Pro Pro Pro Leu Pro Pro Arg Ala Thr Pro Ser 420 425 430 aga ccc ggg gaa gcg gcg gtg acg gcc gca ccc gcc agt gcc tca gtc 1344 Arg Pro Gly Glu Ala Ala Val Thr Ala Ala Pro Ala Ser Ala Ser Val 435 440 445 tcg tct gcg tcc tcc tcg ggg tcg acc ctg gag tgc atc ctg tac aaa 1392 Ser Ser Ala Ser Ser Ser Gly Ser Thr Leu Glu Cys Ile Leu Tyr Lys 450 455 460 gcg gag ggc gcg ccg ccc cag cag ggc ccg ttc gcg ccg ccg ccc tgc 1440 Ala Glu Gly Ala Pro Pro Gln Gln Gly Pro Phe Ala Pro Pro Pro Cys 465 470 475 480 aag gcg ccg ggc gcg agc ggc tgc ctg ctc ccg cgg gac ggc ctg ccc 1488 Lys Ala Pro Gly Ala Ser Gly Cys Leu Leu Pro Arg Asp Gly Leu Pro 485 490 495 tcc acc tcc gcc tct gcc gcc gcc gcc ggg gcg gcc ccc gcg ctc tac 1536 Ser Thr Ser Ala Ser Ala Ala Ala Ala Gly Ala Ala Pro Ala Leu Tyr 500 505 510 cct gca ctc ggc ctc aac ggg ctc ccg cag ctc ggc tac cag gcc gcc 1584 Pro Ala Leu Gly Leu Asn Gly Leu Pro Gln Leu Gly Tyr Gln Ala Ala 515 520 525 gtg ctc aag gag ggc ctg ccg cag gtc tac ccg ccc tat ctc aac tac 1632 Val Leu Lys Glu Gly Leu Pro Gln Val Tyr Pro Pro Tyr Leu Asn Tyr 530 535 540 ctg agg ccg gat tca gaa gcc agc cag agc cca caa tac agc ttc gag 1680 Leu Arg Pro Asp Ser Glu Ala Ser Gln Ser Pro Gln Tyr Ser Phe Glu 545 550 555 560 tca tta cct cag aag att tgt tta atc tgt ggg gat gaa gca tca ggc 1728 Ser Leu Pro Gln Lys 
Ile Cys Leu Ile Cys Gly Asp Glu Ala Ser Gly 565 570 575 tgt cat tat ggt gtc ctt acc tgt ggg agc tgt aag gtc ttc ttt aag 1776 Cys His Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys 580 585 590 agg gca atg gaa ggg cag cac aac tac tta tgt gct gga aga aat gac 1824 Arg Ala Met Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp 595 600 605 tgc atc gtt gat aaa atc cgc aga aaa aac tgc cca gca tgt cgc ctt 1872 Cys Ile Val Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Leu 610 615 620 aga aag tgc tgt cag gct ggc atg gtc ctt gga ggt cga aaa ttt aaa 1920 Arg Lys Cys Cys Gln Ala Gly Met Val Leu Gly Gly Arg Lys Phe Lys 625 630 635 640 aag ttc aat aaa gtc aga gtt gtg aga gca ctg gat gct gtt gct ctc 1968 Lys Phe Asn Lys Val Arg Val Val Arg Ala Leu Asp Ala Val Ala Leu 645 650 655 cca cag cca gtg ggc gtt cca aat gaa agc caa gcc cta agc cag aga 2016 Pro Gln Pro Val Gly Val Pro Asn Glu Ser Gln Ala Leu Ser Gln Arg 660 665 670 ttc act ttt tca cca ggt caa gac ata cag ttg att cca cca ctg atc 2064 Phe Thr Phe Ser Pro Gly Gln Asp Ile Gln Leu Ile Pro Pro Leu Ile 675 680 685 aac ctg tta atg agc att gaa cca gat gtg atc tat gca gga cat gac 2112 Asn Leu Leu Met Ser Ile Glu Pro Asp Val Ile Tyr Ala Gly His Asp 690 695 700 aac aca aaa cct gac acc tcc agt tct ttg ctg aca agt ctt aat caa 2160 Asn Thr Lys Pro Asp Thr Ser Ser Ser Leu Leu Thr Ser Leu Asn Gln 705 710 715 720 cta ggc gag agg caa ctt ctt tca gta gtc aag tgg tct aaa tca ttg 2208 Leu Gly Glu Arg Gln Leu Leu Ser Val Val Lys Trp Ser Lys Ser Leu 725 730 735 cca ggt ttt cga aac tta cat att gat gac cag ata act ctc att cag 2256 Pro Gly Phe Arg Asn Leu His Ile Asp Asp Gln Ile Thr Leu Ile Gln 740 745 750 tat tct tgg atg agc tta atg gtg ttt ggt cta gga tgg aga tcc tac 2304 Tyr Ser Trp Met Ser Leu Met Val Phe Gly Leu Gly Trp Arg Ser Tyr 755 760 765 aaa cac gtc agt ggg cag atg ctg tat ttt gca cct gat cta ata cta 2352 Lys His Val Ser Gly Gln Met Leu Tyr Phe Ala Pro Asp Leu Ile Leu 770 775 780 aat gaa cag cgg atg aaa gaa tca tca ttc tat tca 
tta tgc ctt acc 2400 Asn Glu Gln Arg Met Lys Glu Ser Ser Phe Tyr Ser Leu Cys Leu Thr 785 790 795 800 atg tgg cag atc cca cag gag ttt gtc aag ctt caa gtt agc caa gaa 2448 Met Trp Gln Ile Pro Gln Glu Phe Val Lys Leu Gln Val Ser Gln Glu 805 810 815 gag ttc ctc tgt atg aaa gta ttg tta ctt ctt aat aca att cct ttg 2496 Glu Phe Leu Cys Met Lys Val Leu Leu Leu Leu Asn Thr Ile Pro Leu 820 825 830 gaa ggg cta cga agt caa acc cag ttt gag gag atg agg tca agc tac 2544 Glu Gly Leu Arg Ser Gln Thr Gln Phe Glu Glu Met Arg Ser Ser Tyr 835 840 845 att aga gag ctc atc aag gca att ggt ttg agg caa aaa gga gtt gtg 2592 Ile Arg Glu Leu Ile Lys Ala Ile Gly Leu Arg Gln Lys Gly Val Val 850 855 860 tcg agc tca cag cgt ttc tat caa ctt aca aaa ctt ctt gat aac ttg 2640 Ser Ser Ser Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Asn Leu 865 870 875 880 cat gat ctt gtc aaa caa ctt cat ctg tac tgc ttg aat aca ttt atc 2688 His Asp Leu Val Lys Gln Leu His Leu Tyr Cys Leu Asn Thr Phe Ile 885 890 895 cag tcc cgg gca ctg agt gtt gaa ttt cca gaa atg atg tct gaa gtt 2736 Gln Ser Arg Ala Leu Ser Val Glu Phe Pro Glu Met Met Ser Glu Val 900 905 910 att gct gca caa tta ccc aag ata ttg gca ggg atg gtg aaa ccc ctt 2784 Ile Ala Ala Gln Leu Pro Lys Ile Leu Ala Gly Met Val Lys Pro Leu 915 920 925 ctc ttt cat aaa aag tga 2802 Leu Phe His Lys Lys 930 4 933 PRT Homo sapiens 4 Met Thr Glu Leu Lys Ala Lys Gly Pro Arg Ala Pro His Val Ala Gly 1 5 10 15 Gly Pro Pro Ser Pro Glu Val Gly Ser Pro Leu Leu Cys Arg Pro Ala 20 25 30 Ala Gly Pro Phe Pro Gly Ser Gln Thr Ser Asp Thr Leu Pro Glu Val 35 40 45 Ser Ala Ile Pro Ile Ser Leu Asp Gly Leu Leu Phe Pro Arg Pro Cys 50 55 60 Gln Gly Gln Asp Pro Ser Asp Glu Lys Thr Gln Asp Gln Gln Ser Leu 65 70 75 80 Ser Asp Val Glu Gly Ala Tyr Ser Arg Ala Glu Ala Thr Arg Gly Ala 85 90 95 Gly Gly Ser Ser Ser Ser Pro Pro Glu Lys Asp Ser Gly Leu Leu Asp 100 105 110 Ser Val Leu Asp Thr Leu Leu Ala Pro Ser Gly Pro Gly Gln Ser Gln 115 120 125 Pro Ser Pro Pro Ala Cys Glu Val Thr Ser Ser Trp Cys Leu Phe 
Gly 130 135 140 Pro Glu Leu Pro Glu Asp Pro Pro Ala Ala Pro Ala Thr Gln Arg Val 145 150 155 160 Leu Ser Pro Leu Met Ser Arg Ser Gly Cys Lys Val Gly Asp Ser Ser 165 170 175 Gly Thr Ala Ala Ala His Lys Val Leu Pro Arg Gly Leu Ser Pro Ala 180 185 190 Arg Gln Leu Leu Leu Pro Ala Ser Glu Ser Pro His Trp Ser Gly Ala 195 200 205 Pro Val Lys Pro Ser Pro Gln Ala Ala Ala Val Glu Val Glu Glu Glu 210 215 220 Asp Gly Ser Glu Ser Glu Glu Ser Ala Gly Pro Leu Leu Lys Gly Lys 225 230 235 240 Pro Arg Ala Leu Gly Gly Ala Ala Ala Gly Gly Gly Ala Ala Ala Val 245 250 255 Pro Pro Gly Ala Ala Ala Gly Gly Val Ala Leu Val Pro Lys Glu Asp 260 265 270 Ser Arg Phe Ser Ala Pro Arg Val Ala Leu Val Glu Gln Asp Ala Pro 275 280 285 Met Ala Pro Gly Arg Ser Pro Leu Ala Thr Thr Val Met Asp Phe Ile 290 295 300 His Val Pro Ile Leu Pro Leu Asn His Ala Leu Leu Ala Ala Arg Thr 305 310 315 320 Arg Gln Leu Leu Glu Asp Glu Ser Tyr Asp Gly Gly Ala Gly Ala Ala 325 330 335 Ser Ala Phe Ala Pro Pro Arg Ser Ser Pro Cys Ala Ser Ser Thr Pro 340 345 350 Val Ala Val Gly Asp Phe Pro Asp Cys Ala Tyr Pro Pro Asp Ala Glu 355 360 365 Pro Lys Asp Asp Ala Tyr Pro Leu Tyr Ser Asp Phe Gln Pro Pro Ala 370 375 380 Leu Lys Ile Lys Glu Glu Glu Glu Gly Ala Glu Ala Ser Ala Arg Ser 385 390 395 400 Pro Arg Ser Tyr Leu Val Ala Gly Ala Asn Pro Ala Ala Phe Pro Asp 405 410 415 Phe Pro Leu Gly Pro Pro Pro Pro Leu Pro Pro Arg Ala Thr Pro Ser 420 425 430 Arg Pro Gly Glu Ala Ala Val Thr Ala Ala Pro Ala Ser Ala Ser Val 435 440 445 Ser Ser Ala Ser Ser Ser Gly Ser Thr Leu Glu Cys Ile Leu Tyr Lys 450 455 460 Ala Glu Gly Ala Pro Pro Gln Gln Gly Pro Phe Ala Pro Pro Pro Cys 465 470 475 480 Lys Ala Pro Gly Ala Ser Gly Cys Leu Leu Pro Arg Asp Gly Leu Pro 485 490 495 Ser Thr Ser Ala Ser Ala Ala Ala Ala Gly Ala Ala Pro Ala Leu Tyr 500 505 510 Pro Ala Leu Gly Leu Asn Gly Leu Pro Gln Leu Gly Tyr Gln Ala Ala 515 520 525 Val Leu Lys Glu Gly Leu Pro Gln Val Tyr Pro Pro Tyr Leu Asn Tyr 530 535 540 Leu Arg Pro Asp Ser Glu Ala Ser Gln Ser Pro Gln Tyr Ser Phe Glu 
545 550 555 560 Ser Leu Pro Gln Lys Ile Cys Leu Ile Cys Gly Asp Glu Ala Ser Gly 565 570 575 Cys His Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys 580 585 590 Arg Ala Met Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp 595 600 605 Cys Ile Val Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Leu 610 615 620 Arg Lys Cys Cys Gln Ala Gly Met Val Leu Gly Gly Arg Lys Phe Lys 625 630 635 640 Lys Phe Asn Lys Val Arg Val Val Arg Ala Leu Asp Ala Val Ala Leu 645 650 655 Pro Gln Pro Val Gly Val Pro Asn Glu Ser Gln Ala Leu Ser Gln Arg 660 665 670 Phe Thr Phe Ser Pro Gly Gln Asp Ile Gln Leu Ile Pro Pro Leu Ile 675 680 685 Asn Leu Leu Met Ser Ile Glu Pro Asp Val Ile Tyr Ala Gly His Asp 690 695 700 Asn Thr Lys Pro Asp Thr Ser Ser Ser Leu Leu Thr Ser Leu Asn Gln 705 710 715 720 Leu Gly Glu Arg Gln Leu Leu Ser Val Val Lys Trp Ser Lys Ser Leu 725 730 735 Pro Gly Phe Arg Asn Leu His Ile Asp Asp Gln Ile Thr Leu Ile Gln 740 745 750 Tyr Ser Trp Met Ser Leu Met Val Phe Gly Leu Gly Trp Arg Ser Tyr 755 760 765 Lys His Val Ser Gly Gln Met Leu Tyr Phe Ala Pro Asp Leu Ile Leu 770 775 780 Asn Glu Gln Arg Met Lys Glu Ser Ser Phe Tyr Ser Leu Cys Leu Thr 785 790 795 800 Met Trp Gln Ile Pro Gln Glu Phe Val Lys Leu Gln Val Ser Gln Glu 805 810 815 Glu Phe Leu Cys Met Lys Val Leu Leu Leu Leu Asn Thr Ile Pro Leu 820 825 830 Glu Gly Leu Arg Ser Gln Thr Gln Phe Glu Glu Met Arg Ser Ser Tyr 835 840 845 Ile Arg Glu Leu Ile Lys Ala Ile Gly Leu Arg Gln Lys Gly Val Val 850 855 860 Ser Ser Ser Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Asn Leu 865 870 875 880 His Asp Leu Val Lys Gln Leu His Leu Tyr Cys Leu Asn Thr Phe Ile 885 890 895 Gln Ser Arg Ala Leu Ser Val Glu Phe Pro Glu Met Met Ser Glu Val 900 905 910 Ile Ala Ala Gln Leu Pro Lys Ile Leu Ala Gly Met Val Lys Pro Leu 915 920 925 Leu Phe His Lys Lys 930 5 1725 DNA Artificial Sequence Fusion protein of bacteriophage P1-Cre19V336A- PR676-914 5 atg ggc gcc acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc 48 Met Gly Ala Thr Ser Asp Glu Val Arg 
Lys Asn Leu Met Asp Met Phe 1 5 10 15 agg gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc 96 Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser 20 25 30 gtt tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg 144 Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp 35 40 45 ttt ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag 192 Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln 50 55 60 gcg cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta 240 Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu 65 70 75 80 aac atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat 288 Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn 85 90 95 gct gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc 336 Ala Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala 100 105 110 ggt gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac 384 Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp 115 120 125 cag gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt 432 Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg 130 135 140 aat ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc 480 Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala 145 150 155 160 gaa att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg 528 Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly 165 170 175 aga atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca 576 Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala 180 185 190 ggt gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga 624 Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg 195 200 205 tgg att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt 672 Trp Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe 210 215 220 tgc cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag 720 Cys Arg Val Arg Lys Asn Gly 
Val Ala Ala Pro Ser Ala Thr Ser Gln 225 230 235 240 cta tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg 768 Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu 245 250 255 att tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct 816 Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser 260 265 270 gga cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga 864 Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly 275 280 285 gtt tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat 912 Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn 290 295 300 att gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg 960 Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met 305 310 315 320 gcg cgc ctg ctg gaa gat ggc gat ggc gcc tca cca ggt caa gac ata 1008 Ala Arg Leu Leu Glu Asp Gly Asp Gly Ala Ser Pro Gly Gln Asp Ile 325 330 335 cag ttg att cca cca ctg atc aac ctg tta atg agc att gaa cca gat 1056 Gln Leu Ile Pro Pro Leu Ile Asn Leu Leu Met Ser Ile Glu Pro Asp 340 345 350 gtg atc tat gca gga cat gac aac aca aaa cct gac acc tcc agt tct 1104 Val Ile Tyr Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser 355 360 365 ttg ctg aca agt ctt aat caa cta ggc gag agg caa ctt ctt tca gta 1152 Leu Leu Thr Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val 370 375 380 gtc aag tgg tct aaa tca ttg cca ggt ttt cga aac tta cat att gat 1200 Val Lys Trp Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp 385 390 395 400 gac cag ata act ctc att cag tat tct tgg atg agc tta atg gtg ttt 1248 Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe 405 410 415 ggt cta gga tgg aga tcc tac aaa cac gtc agt ggg cag atg ctg tat 1296 Gly Leu Gly Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr 420 425 430 ttt gca cct gat cta ata cta aat gaa cag cgg atg aaa gaa tca tca 1344 Phe Ala Pro Asp Leu Ile Leu Asn Glu Gln Arg Met Lys Glu Ser Ser 435 440 445 ttc tat tca tta tgc ctt acc atg tgg cag atc cca cag gag ttt 
gtc 1392 Phe Tyr Ser Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val 450 455 460 aag ctt caa gtt agc caa gaa gag ttc ctc tgt atg aaa gta ttg tta 1440 Lys Leu Gln Val Ser Gln Glu Glu Phe Leu Cys Met Lys Val Leu Leu 465 470 475 480 ctt ctt aat aca att cct ttg gaa ggg cta cga agt caa acc cag ttt 1488 Leu Leu Asn Thr Ile Pro Leu Glu Gly Leu Arg Ser Gln Thr Gln Phe 485 490 495 gag gag atg agg tca agc tac att aga gag ctc atc aag gca att ggt 1536 Glu Glu Met Arg Ser Ser Tyr Ile Arg Glu Leu Ile Lys Ala Ile Gly 500 505 510 ttg agg caa aaa gga gtt gtg tcg agc tca cag cgt ttc tat caa ctt 1584 Leu Arg Gln Lys Gly Val Val Ser Ser Ser Gln Arg Phe Tyr Gln Leu 515 520 525 aca aaa ctt ctt gat aac ttg cat gat ctt gtc aaa caa ctt cat ctg 1632 Thr Lys Leu Leu Asp Asn Leu His Asp Leu Val Lys Gln Leu His Leu 530 535 540 tac tgc ttg aat aca ttt atc cag tcc cgg gca ctg agt gtt gaa ttt 1680 Tyr Cys Leu Asn Thr Phe Ile Gln Ser Arg Ala Leu Ser Val Glu Phe 545 550 555 560 cca gaa atg atg tct gaa gtt att gct atg cat gcg tac gga tag 1725 Pro Glu Met Met Ser Glu Val Ile Ala Met His Ala Tyr Gly 565 570 6 574 PRT Artificial Sequence Fusion protein of bacteriophage P1-Cre19V336A- PR676-914 6 Met Gly Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe 1 5 10 15 Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser 20 25 30 Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp 35 40 45 Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln 50 55 60 Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu 65 70 75 80 Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn 85 90 95 Ala Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala 100 105 110 Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp 115 120 125 Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg 130 135 140 Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala 145 150 155 160 Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly 165 
170 175 Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala 180 185 190 Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg 195 200 205 Trp Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe 210 215 220 Cys Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln 225 230 235 240 Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu 245 250 255 Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser 260 265 270 Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly 275 280 285 Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn 290 295 300 Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met 305 310 315 320 Ala Arg Leu Leu Glu Asp Gly Asp Gly Ala Ser Pro Gly Gln Asp Ile 325 330 335 Gln Leu Ile Pro Pro Leu Ile Asn Leu Leu Met Ser Ile Glu Pro Asp 340 345 350 Val Ile Tyr Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser 355 360 365 Leu Leu Thr Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val 370 375 380 Val Lys Trp Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp 385 390 395 400 Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe 405 410 415 Gly Leu Gly Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr 420 425 430 Phe Ala Pro Asp Leu Ile Leu Asn Glu Gln Arg Met Lys Glu Ser Ser 435 440 445 Phe Tyr Ser Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val 450 455 460 Lys Leu Gln Val Ser Gln Glu Glu Phe Leu Cys Met Lys Val Leu Leu 465 470 475 480 Leu Leu Asn Thr Ile Pro Leu Glu Gly Leu Arg Ser Gln Thr Gln Phe 485 490 495 Glu Glu Met Arg Ser Ser Tyr Ile Arg Glu Leu Ile Lys Ala Ile Gly 500 505 510 Leu Arg Gln Lys Gly Val Val Ser Ser Ser Gln Arg Phe Tyr Gln Leu 515 520 525 Thr Lys Leu Leu Asp Asn Leu His Asp Leu Val Lys Gln Leu His Leu 530 535 540 Tyr Cys Leu Asn Thr Phe Ile Gln Ser Arg Ala Leu Ser Val Glu Phe 545 550 555 560 Pro Glu Met Met Ser Glu Val Ile Ala Met His Ala Tyr Gly 565 570 7 1725 DNA Artificial Sequence Fusion protein hCre19V336A-PR676-914 7 atg ggt gcc 
acc tct gat gaa gtc agg aag aac ctg atg gac atg ttc 48 Met Gly Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe 1 5 10 15 agg gac agg cag gcc ttc tct gaa cac acc tgg aag atg ctc ctg tct 96 Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser 20 25 30 gtg tgc aga tcc tgg gct gcc tgg tgc aag ctg aac aac agg aaa tgg 144 Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp 35 40 45 ttc cct gct gaa cct gag gat gtg agg gac tac ctc ctg tac ctg caa 192 Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln 50 55 60 gcc aga ggc ctg gct gtg aag acc atc caa cag cac ctg ggc cag ctc 240 Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu 65 70 75 80 aac atg ctg cac agg aga tct ggc ctg cct cgc cct tct gac tcc aat 288 Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn 85 90 95 gct gtg tcc ctg gtg atg agg aga atc aga aag gag aat gtg gat gct 336 Ala Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala 100 105 110 ggg gag aga gcc aag cag gcc ctg gcc ttt gaa cgc act gac ttt gac 384 Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp 115 120 125 caa gtc aga tcc ctg atg gag aac tct gac aga tgc cag gac atc agg 432 Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg 130 135 140 aac ctg gcc ttc ctg ggc att gcc tac aac acc ctg ctg cgc att gcc 480 Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala 145 150 155 160 gaa att gcc aga atc aga gtg aag gac atc tcc cgc acc gat ggt ggg 528 Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly 165 170 175 aga atg ctg atc cac att ggc agg acc aag acc ctg gtg tcc aca gct 576 Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala 180 185 190 ggt gtg gag aag gcc ctg tcc ctg ggg gtt acc aag ctg gtg gag aga 624 Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg 195 200 205 tgg atc tct gtg tct ggt gtg gct gat gac ccc aac aac tac ctg ttc 672 Trp Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe 210 215 220 tgc cgg 
gtc aga aag aat ggt gtg gct gcc cct tct gcc acc tcc caa 720 Cys Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln 225 230 235 240 ctg tcc acc cgg gcc ctg gaa ggg atc ttt gag gcc acc cac cgc ctg 768 Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu 245 250 255 atc tat ggt gcc aag gat gac tct ggg cag aga tac ctg gcc tgg tct 816 Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser 260 265 270 ggc cac tct gcc aga gtg ggt gct gcc agg gac atg gcc agg gct ggt 864 Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly 275 280 285 gtg tcc atc cct gaa atc atg cag gct ggt ggc tgg acc aat gtg aac 912 Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn 290 295 300 att gtg atg aac tac atc aga aac ctg gac tct gag act ggg gcc atg 960 Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met 305 310 315 320 gcg agg ctg ctc gag gat ggg gac ggc gcc tca cca ggt caa gac ata 1008 Ala Arg Leu Leu Glu Asp Gly Asp Gly Ala Ser Pro Gly Gln Asp Ile 325 330 335 cag ttg att cca cca ctg atc aac ctg tta atg agc att gaa cca gat 1056 Gln Leu Ile Pro Pro Leu Ile Asn Leu Leu Met Ser Ile Glu Pro Asp 340 345 350 gtg atc tat gca gga cat gac aac aca aaa cct gac acc tcc agt tct 1104 Val Ile Tyr Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser 355 360 365 ttg ctg aca agt ctt aat caa cta ggc gag agg caa ctt ctt tca gta 1152 Leu Leu Thr Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val 370 375 380 gtc aag tgg tct aaa tca ttg cca ggt ttt cga aac tta cat att gat 1200 Val Lys Trp Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp 385 390 395 400 gac cag ata act ctc att cag tat tct tgg atg agc tta atg gtg ttt 1248 Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe 405 410 415 ggt cta gga tgg aga tcc tac aaa cac gtc agt ggg cag atg ctg tat 1296 Gly Leu Gly Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr 420 425 430 ttt gca cct gat cta ata cta aat gaa cag cgg atg aaa gaa tca tca 1344 Phe Ala Pro Asp Leu Ile Leu Asn Glu Gln Arg Met 
Lys Glu Ser Ser 435 440 445 ttc tat tca tta tgc ctt acc atg tgg cag atc cca cag gag ttt gtc 1392 Phe Tyr Ser Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val 450 455 460 aag ctt caa gtt agc caa gaa gag ttc ctc tgt atg aaa gta ttg tta 1440 Lys Leu Gln Val Ser Gln Glu Glu Phe Leu Cys Met Lys Val Leu Leu 465 470 475 480 ctt ctt aat aca att cct ttg gaa ggg cta cga agt caa acc cag ttt 1488 Leu Leu Asn Thr Ile Pro Leu Glu Gly Leu Arg Ser Gln Thr Gln Phe 485 490 495 gag gag atg agg tca agc tac att aga gag ctc atc aag gca att ggt 1536 Glu Glu Met Arg Ser Ser Tyr Ile Arg Glu Leu Ile Lys Ala Ile Gly 500 505 510 ttg agg caa aaa gga gtt gtg tcg agc tca cag cgt ttc tat caa ctt 1584 Leu Arg Gln Lys Gly Val Val Ser Ser Ser Gln Arg Phe Tyr Gln Leu 515 520 525 aca aaa ctt ctt gat aac ttg cat gat ctt gtc aaa caa ctt cat ctg 1632 Thr Lys Leu Leu Asp Asn Leu His Asp Leu Val Lys Gln Leu His Leu 530 535 540 tac tgc ttg aat aca ttt atc cag tcc cgg gca ctg agt gtt gaa ttt 1680 Tyr Cys Leu Asn Thr Phe Ile Gln Ser Arg Ala Leu Ser Val Glu Phe 545 550 555 560 cca gaa atg atg tct gaa gtt att gct atg cat gcg tac gga tag 1725 Pro Glu Met Met Ser Glu Val Ile Ala Met His Ala Tyr Gly 565 570 8 574 PRT Artificial Sequence Fusion protein hCre19V336A-PR676-914 8 Met Gly Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe 1 5 10 15 Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser 20 25 30 Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp 35 40 45 Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln 50 55 60 Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu 65 70 75 80 Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn 85 90 95 Ala Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala 100 105 110 Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp 115 120 125 Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg 130 135 140 Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala 145 150 155 160 
Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly 165 170 175 Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala 180 185 190 Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg 195 200 205 Trp Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe 210 215 220 Cys Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln 225 230 235 240 Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu 245 250 255 Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser 260 265 270 Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly 275 280 285 Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn 290 295 300 Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met 305 310 315 320 Ala Arg Leu Leu Glu Asp Gly Asp Gly Ala Ser Pro Gly Gln Asp Ile 325 330 335 Gln Leu Ile Pro Pro Leu Ile Asn Leu Leu Met Ser Ile Glu Pro Asp 340 345 350 Val Ile Tyr Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser 355 360 365 Leu Leu Thr Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val 370 375 380 Val Lys Trp Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp 385 390 395 400 Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe 405 410 415 Gly Leu Gly Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr 420 425 430 Phe Ala Pro Asp Leu Ile Leu Asn Glu Gln Arg Met Lys Glu Ser Ser 435 440 445 Phe Tyr Ser Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val 450 455 460 Lys Leu Gln Val Ser Gln Glu Glu Phe Leu Cys Met Lys Val Leu Leu 465 470 475 480 Leu Leu Asn Thr Ile Pro Leu Glu Gly Leu Arg Ser Gln Thr Gln Phe 485 490 495 Glu Glu Met Arg Ser Ser Tyr Ile Arg Glu Leu Ile Lys Ala Ile Gly 500 505 510 Leu Arg Gln Lys Gly Val Val Ser Ser Ser Gln Arg Phe Tyr Gln Leu 515 520 525 Thr Lys Leu Leu Asp Asn Leu His Asp Leu Val Lys Gln Leu His Leu 530 535 540 Tyr Cys Leu Asn Thr Phe Ile Gln Ser Arg Ala Leu Ser Val Glu Phe 545 550 555 560 Pro Glu Met Met Ser Glu Val Ile Ala Met His Ala Tyr Gly 565 570 9 1800 DNA 
Artificial Sequence Fusion protein hCre19V336A-PR650-914 9 atg ggt gcc acc tct gat gaa gtc agg aag aac ctg atg gac atg ttc 48 Met Gly Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe 1 5 10 15 agg gac agg cag gcc ttc tct gaa cac acc tgg aag atg ctc ctg tct 96 Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser 20 25 30 gtg tgc aga tcc tgg gct gcc tgg tgc aag ctg aac aac agg aaa tgg 144 Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp 35 40 45 ttc cct gct gaa cct gag gat gtg agg gac tac ctc ctg tac ctg caa 192 Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln 50 55 60 gcc aga ggc ctg gct gtg aag acc atc caa cag cac ctg ggc cag ctc 240 Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu 65 70 75 80 aac atg ctg cac agg aga tct ggc ctg cct cgc cct tct gac tcc aat 288 Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn 85 90 95 gct gtg tcc ctg gtg atg agg aga atc aga aag gag aat gtg gat gct 336 Ala Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala 100 105 110 ggg gag aga gcc aag cag gcc ctg gcc ttt gaa cgc act gac ttt gac 384 Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp 115 120 125 caa gtc aga tcc ctg atg gag aac tct gac aga tgc cag gac atc agg 432 Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg 130 135 140 aac ctg gcc ttc ctg ggc att gcc tac aac acc ctg ctg cgc att gcc 480 Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala 145 150 155 160 gaa att gcc aga atc aga gtg aag gac atc tcc cgc acc gat ggt ggg 528 Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly 165 170 175 aga atg ctg atc cac att ggc agg acc aag acc ctg gtg tcc aca gct 576 Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala 180 185 190 ggt gtg gag aag gcc ctg tcc ctg ggg gtt acc aag ctg gtg gag aga 624 Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg 195 200 205 tgg atc tct gtg tct ggt gtg gct gat gac ccc aac aac tac ctg ttc 672 Trp Ile Ser 
Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe 210 215 220 tgc cgg gtc aga aag aat ggt gtg gct gcc cct tct gcc acc tcc caa 720 Cys Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln 225 230 235 240 ctg tcc acc cgg gcc ctg gaa ggg atc ttt gag gcc acc cac cgc ctg 768 Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu 245 250 255 atc tat ggt gcc aag gat gac tct ggg cag aga tac ctg gcc tgg tct 816 Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser 260 265 270 ggc cac tct gcc aga gtg ggt gct gcc agg gac atg gcc agg gct ggt 864 Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly 275 280 285 gtg tcc atc cct gaa atc atg cag gct ggt ggc tgg acc aat gtg aac 912 Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn 290 295 300 att gtg atg aac tac atc aga aac ctg gac tct gag act ggg gcc atg 960 Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met 305 310 315 320 gcg agg ctg ctc gag gat ggg gac ggc gcc ctg gat gct gtt gct ctc 1008 Ala Arg Leu Leu Glu Asp Gly Asp Gly Ala Leu Asp Ala Val Ala Leu 325 330 335 cca cag cca gtg ggc gtt cca aat gaa agc caa gcc cta agc cag aga 1056 Pro Gln Pro Val Gly Val Pro Asn Glu Ser Gln Ala Leu Ser Gln Arg 340 345 350 ttc act ttt tca cca ggt caa gac ata cag ttg att cca cca ctg atc 1104 Phe Thr Phe Ser Pro Gly Gln Asp Ile Gln Leu Ile Pro Pro Leu Ile 355 360 365 aac ctg tta atg agc att gaa cca gat gtg atc tat gca gga cat gac 1152 Asn Leu Leu Met Ser Ile Glu Pro Asp Val Ile Tyr Ala Gly His Asp 370 375 380 aac aca aaa cct gac acc tcc agt tct ttg ctg aca agt ctt aat caa 1200 Asn Thr Lys Pro Asp Thr Ser Ser Ser Leu Leu Thr Ser Leu Asn Gln 385 390 395 400 cta ggc gag agg caa ctt ctt tca gta gtc aag tgg tct aaa tca ttg 1248 Leu Gly Glu Arg Gln Leu Leu Ser Val Val Lys Trp Ser Lys Ser Leu 405 410 415 cca ggt ttt cga aac tta cat att gat gac cag ata act ctc att cag 1296 Pro Gly Phe Arg Asn Leu His Ile Asp Asp Gln Ile Thr Leu Ile Gln 420 425 430 tat tct tgg atg agc tta atg gtg ttt ggt cta 
gga tgg aga tcc tac 1344 Tyr Ser Trp Met Ser Leu Met Val Phe Gly Leu Gly Trp Arg Ser Tyr 435 440 445 aaa cac gtc agt ggg cag atg ctg tat ttt gca cct gat cta ata cta 1392 Lys His Val Ser Gly Gln Met Leu Tyr Phe Ala Pro Asp Leu Ile Leu 450 455 460 aat gaa cag cgg atg aaa gaa tca tca ttc tat tca tta tgc ctt acc 1440 Asn Glu Gln Arg Met Lys Glu Ser Ser Phe Tyr Ser Leu Cys Leu Thr 465 470 475 480 atg tgg cag atc cca cag gag ttt gtc aag ctt caa gtt agc caa gaa 1488 Met Trp Gln Ile Pro Gln Glu Phe Val Lys Leu Gln Val Ser Gln Glu 485 490 495 gag ttc ctc tgt atg aaa gta ttg tta ctt ctt aat aca att cct ttg 1536 Glu Phe Leu Cys Met Lys Val Leu Leu Leu Leu Asn Thr Ile Pro Leu 500 505 510 gaa ggg cta cga agt caa acc cag ttt gag gag atg agg tca agc tac 1584 Glu Gly Leu Arg Ser Gln Thr Gln Phe Glu Glu Met Arg Ser Ser Tyr 515 520 525 att aga gag ctc atc aag gca att ggt ttg agg caa aaa gga gtt gtg 1632 Ile Arg Glu Leu Ile Lys Ala Ile Gly Leu Arg Gln Lys Gly Val Val 530 535 540 tcg agc tca cag cgt ttc tat caa ctt aca aaa ctt ctt gat aac ttg 1680 Ser Ser Ser Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Asn Leu 545 550 555 560 cat gat ctt gtc aaa caa ctt cat ctg tac tgc ttg aat aca ttt atc 1728 His Asp Leu Val Lys Gln Leu His Leu Tyr Cys Leu Asn Thr Phe Ile 565 570 575 cag tcc cgg gca ctg agt gtt gaa ttt cca gaa atg atg tct gaa gtt 1776 Gln Ser Arg Ala Leu Ser Val Glu Phe Pro Glu Met Met Ser Glu Val 580 585 590 att gct atg cat gcg tac gga tag 1800 Ile Ala Met His Ala Tyr Gly 595 10 599 PRT Artificial Sequence Fusion protein hCre19V336A-PR650-914 10 Met Gly Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe 1 5 10 15 Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser 20 25 30 Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp 35 40 45 Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln 50 55 60 Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu 65 70 75 80 Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn 85 90 95 Ala 
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala 100 105 110 Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp 115 120 125 Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg 130 135 140 Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala 145 150 155 160 Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly 165 170 175 Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala 180 185 190 Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg 195 200 205 Trp Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe 210 215 220 Cys Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln 225 230 235 240 Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu 245 250 255 Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser 260 265 270 Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly 275 280 285 Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn 290 295 300 Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met 305 310 315 320 Ala Arg Leu Leu Glu Asp Gly Asp Gly Ala Leu Asp Ala Val Ala Leu 325 330 335 Pro Gln Pro Val Gly Val Pro Asn Glu Ser Gln Ala Leu Ser Gln Arg 340 345 350 Phe Thr Phe Ser Pro Gly Gln Asp Ile Gln Leu Ile Pro Pro Leu Ile 355 360 365 Asn Leu Leu Met Ser Ile Glu Pro Asp Val Ile Tyr Ala Gly His Asp 370 375 380 Asn Thr Lys Pro Asp Thr Ser Ser Ser Leu Leu Thr Ser Leu Asn Gln 385 390 395 400 Leu Gly Glu Arg Gln Leu Leu Ser Val Val Lys Trp Ser Lys Ser Leu 405 410 415 Pro Gly Phe Arg Asn Leu His Ile Asp Asp Gln Ile Thr Leu Ile Gln 420 425 430 Tyr Ser Trp Met Ser Leu Met Val Phe Gly Leu Gly Trp Arg Ser Tyr 435 440 445 Lys His Val Ser Gly Gln Met Leu Tyr Phe Ala Pro Asp Leu Ile Leu 450 455 460 Asn Glu Gln Arg Met Lys Glu Ser Ser Phe Tyr Ser Leu Cys Leu Thr 465 470 475 480 Met Trp Gln Ile Pro Gln Glu Phe Val Lys Leu Gln Val Ser Gln Glu 485 490 495 Glu Phe Leu Cys Met Lys Val Leu Leu Leu Leu Asn Thr Ile Pro Leu 500 505 510 Glu Gly 
Leu Arg Ser Gln Thr Gln Phe Glu Glu Met Arg Ser Ser Tyr 515 520 525 Ile Arg Glu Leu Ile Lys Ala Ile Gly Leu Arg Gln Lys Gly Val Val 530 535 540 Ser Ser Ser Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Asn Leu 545 550 555 560 His Asp Leu Val Lys Gln Leu His Leu Tyr Cys Leu Asn Thr Phe Ile 565 570 575 Gln Ser Arg Ala Leu Ser Val Glu Phe Pro Glu Met Met Ser Glu Val 580 585 590 Ile Ala Met His Ala Tyr Gly 595 11 81 DNA Artificial Sequence 3′-end of the Cre-coding sequence from bacteriophage P1-Cre 11 aatgtaaata ttgtcatgaa ctatatccgt aacctggata gtgaaacagg ggcaatggtg 60 cgcctgctgg aagatggcga t 81 12 81 DNA Artificial Sequence 3′-end of the Cre-coding sequence from hCre 12 aatgtgaaca ttgtgatgaa ctacatcaga aacctggact ctgagactgg ggccatggtg 60 aggctgctcg aggatgggga c 81 13 81 DNA Artificial Sequence 3′-end of the Cre-coding sequence from P1-CreV336A 13 aatgtaaata ttgtcatgaa ctatatccgt aacctggata gtgaaacagg ggcaatggcg 60 cgcctgctgg aagatggcga t 81 14 81 DNA Artificial Sequence 3′-end of the Cre-coding sequence from hCreV336A 14 aatgtgaaca ttgtgatgaa ctacatcaga aacctggact ctgagactgg ggccatggcg 60 aggctgctcg aggatgggga c 81 15 31 DNA Artificial Sequence Primer 15 aaattcgtac gcatcgccat cttccagcag g 31 16 30 DNA Artificial Sequence Primer 16 aaattggcgc catcgccatc ttccagcagg 30 17 29 DNA Artificial Sequence Primer 17 aatttggcgc cgtccccatc ctcgagcag 29 18 44 DNA Artificial Sequence Primer 18 aaattggcgc catcgccatc ttccagcagg cgcgccattg cccc 44 19 44 DNA Artificial Sequence Primer 19 aaattggcgc cgtccccatc ctcgagcagc ctcgccatgg cccc 44 20 39 DNA Artificial Sequence Primer 20 gggggatcca ccatgggtgc ctccaacctg ctgactgtg 39 21 42 DNA Artificial Sequence Primer 21 tttaaggatc caccatgggt gccacgagtg atgaggttcg ca 42 22 32 DNA Artificial Sequence Primer 22 tttaacgtac ggcacgagtg atgaggttcg ca 32 23 30 DNA Artificial Sequence Primer 23 tttaaggcgc cacgagtgat gaggttcgca 30 24 1032 DNA Bacteriophage P1 CDS (1)..(1029) misc_feature (335)..(335) The ′Xaa′ at location 335 stands for Lys, Asn, Arg,
Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, Tyr, Trp, Cys, or Phe. 24 atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48 Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 1 5 10 15 gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96 Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 20 25 30 gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144 Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 35 40 45 tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192 Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 50 55 60 ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240 Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 65 70 75 80 cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288 Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 85 90 95 atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336 Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 100 105 110 gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384 Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 115 120 125 gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432 Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 130 135 140 gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480 Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 145 150 155 160 ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528 Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 165 170 175 att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576 Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 180 185 190 atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624 Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 195 200 205 gta gag aag gca ctt agc ctg ggg gta act aaa ctg 
gtc gag cga tgg 672 Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 210 215 220 att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720 Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 225 230 235 240 cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768 Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 245 250 255 tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816 Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 260 265 270 tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864 Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 275 280 285 cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912 His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 290 295 300 tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960 Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 305 310 315 320 gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca nnn nnn 1008 Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Xaa Xaa 325 330 335 nnn ctg ctg gaa gat ggc gat tag 1032 Xaa Leu Leu Glu Asp Gly Asp 340 
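The 3′-end fragments listed as SEQ ID NOS: 11 to 14 can be compared base by base to locate the V336A mutation. The following is a minimal sketch, not part of the original listing: the helper names `differences` and `codon_containing` are hypothetical, and the offset of 948 nucleotides between the fragment and the full-length coding sequence is an assumption inferred from the numbering of the ATG GTG CGC triplets (positions 1003-1011) in the wild-type sequence.

```python
# Fragments as printed in the sequence listing; the spaces are layout only.
P1_WT = ("aatgtaaata ttgtcatgaa ctatatccgt aacctggata gtgaaacagg "
         "ggcaatggtg cgcctgctgg aagatggcga t").replace(" ", "")       # SEQ ID NO: 11
HCRE_WT = ("aatgtgaaca ttgtgatgaa ctacatcaga aacctggact ctgagactgg "
           "ggccatggtg aggctgctcg aggatgggga c").replace(" ", "")     # SEQ ID NO: 12
P1_V336A = ("aatgtaaata ttgtcatgaa ctatatccgt aacctggata gtgaaacagg "
            "ggcaatggcg cgcctgctgg aagatggcga t").replace(" ", "")    # SEQ ID NO: 13
HCRE_V336A = ("aatgtgaaca ttgtgatgaa ctacatcaga aacctggact ctgagactgg "
              "ggccatggcg aggctgctcg aggatgggga c").replace(" ", "")  # SEQ ID NO: 14

# Assumption: fragment position 1 corresponds to nucleotide 949 of the
# full-length wild-type coding sequence.
FRAGMENT_OFFSET = 948


def differences(a, b):
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def codon_containing(seq, position):
    """Return the in-frame codon covering a 1-based position (frame starts at 1)."""
    start = (position - 1) // 3 * 3
    return seq[start:start + 3]


for name, wt, mut in [("P1", P1_WT, P1_V336A), ("hCre", HCRE_WT, HCRE_V336A)]:
    for pos, old, new in differences(wt, mut):
        print(name, pos + FRAGMENT_OFFSET, old, "->", new,
              codon_containing(wt, pos), "->", codon_containing(mut, pos))
```

In both pairs the comparison reports a single t to c exchange in the middle base of the gtg codon, which turns the Val codon into the Ala codon gcg.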

1. A DNA sequence coding for a mutant of bacteriophage P1 recombinase Cre in which the cryptic splicing site in the sequence ATG GTG CGC, which corresponds to positions 1003-1011 in the wild-type sequence shown in SEQ ID NO: 1, has been eliminated by a base mutation changing the Cre protein sequence.
 2. The DNA sequence according to claim 1, wherein the codon GTG in said splicing site sequence has been replaced by a codon XYZ, in which X, Y and Z are independently the nucleotides A, T, C or G, provided that if X=G, then Y≠T.
 3. The DNA sequence according to claim 2, wherein codon XYZ codes for a neutral amino acid and, in particular, is the Ala-coding codon GCG.
 4. The DNA sequence according to one or more of claims 1 to 3, wherein other cryptic splicing sites present in the wild type sequence have been eliminated by silent mutation and/or by base mutations changing the Cre protein sequence.
 5. The DNA sequence according to any of claims 1 to 4, which is truncated at the 5′ terminus with respect to the Cre wild type sequence, especially a DNA sequence in which the nucleotides corresponding to positions 1 to 54 of the wild type are truncated.
 6. The DNA sequence according to any of claims 1 to 5, wherein the DNA sequence has nucleotides at the 5′ end which code for additional amino acids not present in the wild type Cre sequence.
 7. The DNA sequence according to claim 6, wherein said additional nucleotides code for neutral amino acids, especially Met, Gly, Val and Ala.
 8. The DNA sequence according to one or more of claims 1 to 7, wherein said DNA sequence is truncated at the 3′ terminus with respect to the wild type sequence.
 9. The DNA sequence according to claim 1, wherein said DNA sequence comprises the nucleotides 1 to 984 of SEQ ID NOS: 5, 7 or 9.
 10. A DNA sequence coding for a fusion protein derived from bacteriophage P1 recombinase Cre and another functional protein, wherein the sequence encoding bacteriophage P1 recombinase Cre is defined as in claims 1 to 9.
 11. The DNA sequence according to claim 10, wherein said functional protein is a ligand-binding domain of a receptor protein, the receptor being, in particular, a steroid receptor, preferably a progesterone, estrogen or glucocorticoid receptor.
 12. The DNA sequence according to claim 11, wherein said receptor is a progesterone receptor, especially a progesterone receptor which comprises the nucleotides 1948 to 2742 of SEQ ID NO: 3.
 13. The DNA sequence according to any of claims 10 to 12, wherein cryptic splicing sites within the sequence coding for the ligand-binding domain have been eliminated by silent mutation or by a base mutation changing the protein sequence of said ligand-binding domain.
 14. The DNA sequence according to claim 10, having the sequence shown in SEQ ID NOS: 5, 7 or 9.
 15. A vector containing a DNA sequence according to one or more of claims 1 to 14.
 16. A microorganism or transgenic organism containing a vector as defined in claim 15 and/or a DNA sequence as defined in claims 1 to 14.
 17. A Cre mutant or Cre fusion protein encoded by the DNA sequence according to claims 1 to 14.
 18. A process for preparing a Cre mutant or Cre fusion protein as defined in claim 17, comprising the culturing of a microorganism or a cell culture which has been transformed or transfected with a vector according to claim 15.
 19. Use of a DNA sequence according to claims 11 to 14 for the mutagenesis and/or recombination of target sequences containing loxP sites in organisms.
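
The proviso on the replacement codon XYZ in claim 2 (if X=G, then Y≠T) can be illustrated by enumerating all 64 codons and discarding those that begin with GT. This is a sketch for illustration only; the function name `satisfies_claim_2` is hypothetical, and the filter encodes nothing beyond the stated proviso (it does not consider which amino acid, if any, the codon encodes).

```python
from itertools import product

BASES = "ATCG"


def satisfies_claim_2(codon):
    """True unless the codon has the form GTN (X = G together with Y = T)."""
    x, y, _ = codon.upper()
    return not (x == "G" and y == "T")


all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
allowed = [c for c in all_codons if satisfies_claim_2(c)]
excluded = [c for c in all_codons if not satisfies_claim_2(c)]

print(len(allowed), "codons satisfy the proviso")
print("excluded:", sorted(excluded))
print("GCG allowed:", satisfies_claim_2("GCG"))
```

The four excluded codons are exactly those that would keep a GT dinucleotide at the first two codon positions, and the Ala-coding codon GCG named in claim 3 passes the filter.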